
Anchor Function

Neural Information Processing Systems

Figure 7: Actual example of how an anchor function impacts the generated solution. In this section, we provide additional experimental details and results for the experiments in Section 3, including additional details for anchoring (Appendix A.1), the availability heuristic (Appendix A.3), and the filtering of prompts for longer canonical solutions; all other components of the prompts follow Section 3.3.2. We plot the analogous add-var results in Figure 10 and include full numerical results in Table 7. In this section, we augment Section 3.3.3.






Function Extrapolation with Neural Networks and Its Application for Manifolds

Hay, Guy, Sharon, Nir

arXiv.org Artificial Intelligence

This paper addresses the problem of accurately estimating a function on one domain when only its discrete samples are available on another domain. To address this challenge, we use a neural network, which we train to incorporate prior knowledge of the function. In addition, by carefully analyzing the problem, we obtain a bound on the error over the extrapolation domain and define a condition number for this problem that quantifies the level of difficulty of the setup. Compared to other machine learning methods that provide time series prediction, such as transformers, our approach is suitable for setups where the interpolation and extrapolation regions are general subdomains and, in particular, manifolds. In addition, our construction leads to an improved loss function that helps us boost the accuracy and robustness of our neural network. We conduct comprehensive numerical tests and comparisons of our extrapolation versus standard methods. The results illustrate the effectiveness of our approach in various scenarios.


Anchor function: a type of benchmark functions for studying language models

Zhang, Zhongwang, Wang, Zhiwei, Yao, Junjie, Zhou, Zhangchen, Li, Xiaolong, E, Weinan, Xu, Zhi-Qin John

arXiv.org Artificial Intelligence

Understanding transformer-based language models is becoming increasingly crucial, particularly as they play pivotal roles in advancing towards artificial general intelligence. However, language model research faces significant challenges, especially for academic research groups with constrained resources. These challenges include complex data structures, unknown target functions, high computational costs and memory requirements, and a lack of interpretability in the inference process, etc. Drawing a parallel to the use of simple models in scientific research, we propose the concept of an anchor function. This is a type of benchmark function designed for studying language models in learning tasks that follow an "anchor-key" pattern. By utilizing the concept of an anchor function, we can construct a series of functions to simulate various language tasks. The anchor function plays a role analogous to that of mice in diabetes research, particularly suitable for academic research. We demonstrate the utility of the anchor function with an example, revealing two basic operations by attention structures in language models: shifting tokens and broadcasting one token from one position to many positions. These operations are also commonly observed in large language models. The anchor function framework, therefore, opens up a series of valuable and accessible research questions for further exploration, especially for theoretical study.
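The "anchor-key" pattern the abstract describes can be illustrated with a small synthetic task generator. This is a minimal sketch, not the paper's exact construction: the anchor token ids (101, 102), the operations they select, and the sequence layout are all illustrative assumptions.

```python
import random

# Each anchor token selects an operation to apply to the key token that
# immediately follows it; the rest of the sequence is random filler.
# Anchor ids and operations here are assumptions for illustration only.
ANCHOR_OPS = {
    101: lambda x: x + 1,   # anchor 101: increment the key
    102: lambda x: x - 1,   # anchor 102: decrement the key
}

def make_example(rng, seq_len=8, vocab=range(1, 100)):
    """Build one (sequence, label) pair: random filler tokens plus one
    anchor-key pair; the label is the anchor's operation applied to the key."""
    seq = [rng.choice(list(vocab)) for _ in range(seq_len)]
    pos = rng.randrange(seq_len - 1)          # leave room for the key
    anchor = rng.choice(list(ANCHOR_OPS))
    key = rng.choice(list(vocab))
    seq[pos], seq[pos + 1] = anchor, key
    return seq, ANCHOR_OPS[anchor](key)

rng = random.Random(0)
seq, label = make_example(rng)
```

A model trained on such pairs must learn to locate the anchor, read the adjacent key, and apply the selected operation, which is the kind of shifting/broadcasting behavior the abstract attributes to attention structures.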


Capturing Failures of Large Language Models via Human Cognitive Biases

Jones, Erik, Steinhardt, Jacob

arXiv.org Artificial Intelligence

Large language models generate complex, open-ended outputs: instead of outputting a class label they write summaries, generate dialogue, or produce working code. In order to assess the reliability of these open-ended generation systems, we aim to identify qualitative categories of erroneous behavior, beyond identifying individual errors. To hypothesize and test for such qualitative errors, we draw inspiration from human cognitive biases -- systematic patterns of deviation from rational judgement. Specifically, we use cognitive biases as motivation to (i) generate hypotheses for problems that models may have, and (ii) develop experiments that elicit these problems. Using code generation as a case study, we find that OpenAI's Codex errs predictably based on how the input prompt is framed, adjusts outputs towards anchors, and is biased towards outputs that mimic frequent training examples. We then use our framework to elicit high-impact errors such as incorrectly deleting files. Our results indicate that experimental methodology from cognitive science can help characterize how machine learning systems behave.


MetaAnchor: Learning to Detect Objects with Customized Anchors

Yang, Tong, Zhang, Xiangyu, Li, Zeming, Zhang, Wenqiang, Sun, Jian

Neural Information Processing Systems

We propose a novel and flexible anchor mechanism named MetaAnchor for object detection frameworks. Unlike many previous detectors, which model anchors in a predefined manner, MetaAnchor allows anchor functions to be dynamically generated from arbitrary customized prior boxes. Taking advantage of weight prediction, MetaAnchor is able to work with most anchor-based object detection systems, such as RetinaNet. Compared with the predefined anchor scheme, we empirically find that MetaAnchor is more robust to anchor settings and bounding box distributions; in addition, it also shows potential on transfer tasks. Our experiment on the COCO detection task shows that MetaAnchor consistently outperforms its counterparts in various scenarios.
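The weight-prediction idea behind the abstract can be sketched as follows: instead of fixing one prediction head per predefined anchor, a small generator maps an arbitrary prior box to the weights of that anchor's head. This is a hedged NumPy sketch under assumed shapes and a two-layer generator; it is not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT, HIDDEN, OUT = 16, 32, 4          # feature dim, generator width, box offsets

# Generator parameters, shared across all anchors (illustrative sizes).
W1 = rng.normal(scale=0.1, size=(2, HIDDEN))
W2 = rng.normal(scale=0.1, size=(HIDDEN, FEAT * OUT + OUT))

def anchor_head(prior_box):
    """Generate the weights and bias of a per-anchor linear head
    from the prior box, encoded here as (log w, log h)."""
    b = np.log(np.asarray(prior_box, dtype=float))
    h = np.maximum(W1.T @ b, 0.0)                      # ReLU hidden layer
    theta = W2.T @ h                                   # flat parameter vector
    W = theta[: FEAT * OUT].reshape(FEAT, OUT)
    bias = theta[FEAT * OUT:]
    return W, bias

def predict_offsets(feature, prior_box):
    """Apply the generated head to a feature vector to get box offsets."""
    W, bias = anchor_head(prior_box)
    return feature @ W + bias

feat = rng.normal(size=FEAT)
offsets = predict_offsets(feat, (32.0, 64.0))   # any customized prior box
```

Because the head's weights are a function of the prior box rather than a fixed table entry, new box shapes can be handled at inference time without retraining, which is the flexibility the abstract emphasizes.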